Given AMD’s recent reputation as a top manufacturer of CPUs and a good stock to invest in, AMD seemed like an interesting choice for replicating a test of independence on whether a stock price goes up or down.
The hypothesis test for independence of AMD stock is as follows:
\(H_0\): Whether AMD stock goes up or down on a given day is independent of its earlier movements, so the number of days until the next up day follows a geometric distribution.
\(H_a\): Whether AMD stock goes up or down on a given day is not independent of its earlier movements.
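As a rough illustration of why independence leads to a geometric distribution, the sketch below simulates independent up/down days and compares the share of each waiting time with \((1-p)^{k-1}p\). The function name `simulate_waits`, the seed, and the 50% up probability are assumptions chosen for this example; none of them come from the AMD data.

```python
import random
from collections import Counter

def simulate_waits(p_up, n_days, seed=0):
    """Simulate independent days and record the days until each up day."""
    rng = random.Random(seed)
    waits = []
    counter = 1
    for _ in range(n_days):
        if rng.random() < p_up:
            # Up day: record how long the wait was and start over
            waits.append(counter)
            counter = 1
        else:
            # Down day: the wait gets one day longer
            counter += 1
    return waits

p_up = 0.5  # hypothetical probability of an up day
waits = simulate_waits(p_up, 200_000)
freq = Counter(waits)

# Compare simulated shares with the geometric probabilities
for k in range(1, 5):
    simulated = freq[k] / len(waits)
    geometric = (1 - p_up) ** (k - 1) * p_up
    print(k, round(simulated, 3), round(geometric, 3))
```

If the days are independent, the two columns should agree closely; the hypothesis test asks whether the AMD streak counts agree in the same way.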
The data used for this test is the adjusted closing price of AMD from December 10th, 2018 to December 8th, 2020, stored in a CSV file (AMD.csv).
library(ggplot2)
library(plotly)
library(dplyr)
stock = read.csv('AMD.csv', stringsAsFactors = FALSE)
stock$Date = as.Date(stock$Date)
# Line plot of the adjusted closing price with a lightly shaded area underneath
p = stock %>%
  ggplot(aes(x = Date, y = Adj.Close)) +
  geom_line(aes(color = "Price"), size = 1) +
  xlab("Date") +
  ylab("Adjusted Closing Price") +
  scale_color_manual(name = "Legend", values = c(Price = "cyan")) +
  theme_minimal() +
  geom_area(alpha = 0.1, fill = "cyan", show.legend = FALSE)
p = ggplotly(p)
p
Time series plot of AMD's adjusted closing price for each trading day from December 10th, 2018 to December 8th, 2020.
To collect the data, I wrote a Python script that reads the adjusted closing price from the CSV file and checks whether each day's price was an increase or a decrease from the previous day. It then counts how often each streak length occurs, where a streak is the number of days until the next up day, grouped into lengths 1 through 6+.
The output summary is included below.
import csv
prices = []
# Reading CSV file and storing data from Adj. Close
with open('AMD.csv', 'r') as AMD:
    csv_reader = csv.reader(AMD)
    for line in csv_reader:
        prices.append(line[5])
# Converting data into floats for calculations and removing the name of the column
prices = prices[1:]
prices = [float(x) for x in prices]
price_difference = []
up_down_tracker = []
# Calculates the difference between each row's price and the next row's price
# (prices[i] - prices[i + 1]).
# Can be used to see how big the difference is between consecutive days.
for i in range(len(prices) - 1):
    price_difference.append(prices[i] - prices[i + 1])
price_difference = [round(x, 5) for x in price_difference]
# Checks the difference in stock price from the previous day
# to determine if the stock price went up or down.
# Then creates a list representing each day as U for up and D for down.
for i in range(len(price_difference)):
    if price_difference[i] < 0:
        up_down_tracker.append('D')
    else:
        up_down_tracker.append('U')
# Printing results
print('Down:', up_down_tracker.count('D'))
## Down: 266
print('Up:', up_down_tracker.count('U'))
## Up: 237
print('Total:', len(up_down_tracker))
## Total: 503
ratio = round(up_down_tracker.count('U') / (up_down_tracker.count('D')
+ up_down_tracker.count('U')), 4)
print('Ratio of up to total:', ratio)
## Ratio of up to total: 0.4712
print()
days_until_up = []
# Counts the number of days until the stock price increases and puts each count into a list.
# If the stock price is up and is up again the next day, the number of days is counted as 1.
counter = 1
for i in range(0, len(up_down_tracker)):
    if up_down_tracker[i] == 'D':
        counter += 1
    else:
        days_until_up.append(counter)
        counter = 1
table = [[0 for i in range(6)] for i in range(2)]
# Creates a 2x6 matrix with the first row holding the streak length (1 to 6+)
# and the second row holding the observed count for that length.
for i in range(6):
    table[0][i] = i + 1
    if i == 5:
        table[0][i] = '6+'
for i in range(6):
    if i == 5:
        table[1][5] = sum(d >= 6 for d in days_until_up)
    else:
        table[1][i] = days_until_up.count(i + 1)
# Adding labels for readability
table[0].insert(0, 'Streaks')
table[1].insert(0, 'Observed')
print('Table for observed results')
## Table for observed results
for line in table:
    print(line)
## ['Streaks', 1, 2, 3, 4, 5, '6+']
## ['Observed', 103, 65, 38, 16, 8, 7]
print()
# Calculates the expected values using a geometric distribution
# First row is the streak length
# Second row is the expected number of streaks of that length
# Third row is the geometric probability of that length
expected = [[0 for i in range(6)] for i in range(3)]
for i in range(6):
    expected[0][i] = i + 1
    # Rounding values to 4 decimal places for readability
    expected[1][i] = round((((1 - ratio) ** (i)) * ratio) * up_down_tracker.count('U'), 4)
    if i == 5:
        # This expression simplifies to the probability of exactly 6 days
        expected[2][i] = round(1 - (1 - (((1 - ratio) ** (i)) * ratio)), 4)
        expected[0][5] = '6+'
    else:
        expected[2][i] = round((((1 - ratio) ** (i)) * ratio), 4)
# Adding labels for readability
expected[0].insert(0, 'Streaks')
expected[1].insert(0, 'Expected')
expected[2].insert(0, 'Probability')
print('Table for expected results')
## Table for expected results
for line in expected:
    print(line)
## ['Streaks', 1, 2, 3, 4, 5, '6+']
## ['Expected', 111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175]
## ['Probability', 0.4712, 0.2492, 0.1318, 0.0697, 0.0368, 0.0195]
print()
print('Table for Observed & Expected')
## Table for Observed & Expected
table.append(expected[1])
for line in table:
    print(line)
## ['Streaks', 1, 2, 3, 4, 5, '6+']
## ['Observed', 103, 65, 38, 16, 8, 7]
## ['Expected', 111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175]
The output gives the ratio of up days to total days, along with the observed count, expected count, and probability for each streak length.
Down: 266
Up: 237
Total: 503
Ratio of up to total: 0.4712
Table for observed results
['Streaks', 1, 2, 3, 4, 5, '6+']
['Observed', 103, 65, 38, 16, 8, 7]
Table for expected results
['Streaks', 1, 2, 3, 4, 5, '6+']
['Expected', 111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175]
['Probability', 0.4712, 0.2492, 0.1318, 0.0697, 0.0368, 0.0195]
Table for Observed & Expected
['Streaks', 1, 2, 3, 4, 5, '6+']
['Observed', 103, 65, 38, 16, 8, 7]
['Expected', 111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175]
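To make the counting convention concrete, here is a minimal sketch of the streak rule applied to a short, made-up sequence. The function name `count_days_until_up` and the example list are hypothetical and are not part of the AMD data.

```python
def count_days_until_up(moves):
    """Return the number of days until each up day for a list of 'U'/'D' labels."""
    streaks = []
    counter = 1
    for move in moves:
        if move == 'D':
            # A down day extends the current wait by one day
            counter += 1
        else:
            # An up day closes the current wait and starts a new one
            streaks.append(counter)
            counter = 1
    return streaks

# Hypothetical sequence, not taken from the AMD data
example = ['D', 'D', 'U', 'U', 'D', 'U']
print(count_days_until_up(example))  # [3, 1, 2]
```

An up day that directly follows another up day counts as a streak of 1, and a run of down days at the very end of the series that is never followed by an up day is not recorded.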
observed = c(103, 65, 38, 16, 8, 7)
expected = c(111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175)
price_table = as.table(rbind(observed, expected))
dimnames(price_table) = list(c('Observed','Expected'),c(1:5, '6+'))
To calculate the expected value for each streak length, take the ratio from the script's output (47.12%) as the probability of an up day, compute the geometric probability for that number of days, and multiply it by the number of up days (237), as shown in the formula.
\((1-0.4712)^{\text{Days}-1}(0.4712) \times (\text{Up days}) = \text{Expected}\)
Formula in context using 4 days as an example:
\((1-0.4712)^{3}(0.4712)\times(237) = 16.5131\)
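The same arithmetic can be checked with a few lines of Python. This is a minimal sketch that takes the up-day ratio (0.4712) and the number of up days (237) from the script output as given; the function name `expected_count` is made up for this example.

```python
p_up = 0.4712  # ratio of up days reported by the script
n_up = 237     # number of up days reported by the script

def expected_count(days, p=p_up, n=n_up):
    """Expected number of streaks lasting exactly `days` days under a geometric model."""
    return (1 - p) ** (days - 1) * p * n

# Expected counts for streak lengths 1 through 6
for days in range(1, 7):
    print(days, round(expected_count(days), 4))
```

For 4 days this gives 16.5131, matching the worked example above.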
Using the data collected from the output of the Python code, the observed and expected frequencies for each streak length (days until the price goes up) are shown in the table below, along with a comparison bar graph.
##                 1        2        3        4        5       6+
## Observed 103.0000  65.0000  38.0000  16.0000   8.0000   7.0000
## Expected 111.6744  59.0534  31.2274  16.5131   8.7321   4.6175
The next step is to calculate the test statistic for the hypothesis test and its corresponding p-value.
test_stat = 0
for (i in 1:length(observed)){
  test_stat = test_stat + ((observed[i] - expected[i])^2) / expected[i]
}
pval = pchisq(test_stat, length(observed) - 1, lower.tail = FALSE)
The test statistic is 4.05. For the chi-square test with 5 degrees of freedom, the p-value is 0.5425156, which is well above common significance levels such as 0.05. There is therefore not enough evidence to reject the null hypothesis: the data are consistent with AMD's daily up and down movements being independent.
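As a cross-check, the statistic and p-value can be recomputed in Python using only the standard library. This is a sketch under stated assumptions: `chi2_sf` is a helper written for this example (it is not a library function), and it relies on the closed-form recurrence for the chi-square upper tail with integer degrees of freedom. The observed and expected values are copied from the table above.

```python
import math

observed = [103, 65, 38, 16, 8, 7]
expected = [111.6744, 59.0534, 31.2274, 16.5131, 8.7321, 4.6175]

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(x / 2))  # exact tail for df = 1
    else:
        k, q = 2, math.exp(-x / 2)             # exact tail for df = 2
    # Step up two degrees of freedom at a time until df is reached
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

# Pearson chi-square statistic: sum of (O - E)^2 / E
test_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf(test_stat, len(observed) - 1)

print(round(test_stat, 2), round(p_value, 4))
```

The result should agree with the R output to within rounding, using the same 5 degrees of freedom as the `pchisq` call.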